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An improved method for the description of hierarchical complex systems by means of a Fokker- 
Planck equation is presented. In particular the limited-memory Broyden-Fletcher-Goldfarb-Shanno 
algorithm for constraint problems (L-BFGS-B) is used to minimize the distance between the numer- 
ical solutions of the Fokker-Planck equation and the empirical probability density functions and thus 
to estimate properly the drift and diffusion term of the Fokker-Planck equation. The optimisation 
routine is applied to a time series of velocity measurements obtained from a turbulent helium gas 
jet in order to demonstrate the benefits and to quantify the improvements of this new optimisation 
routine. 

PACS numbers: 02.50.Ey, 05.45.Tp 



I. INTRODUCTION 

Most complex systems can be assigned to the two fol- 
lowing classes, the time dependent coniplex systems and 
the scale dependent complex systems [Y\. Examples for 
the first class are nonlinear chaotic dynamical systems, 
while systems with a scaling behaviour over a wide range 
of different scales, like turbulence, financial markets or 
earth quakes are examples for the second class. Besides 
the characterisation of new features of these complex 
systems it is a challange to derive effective underlying 
equations for their description. A successful approach to 
such systems is a description through stochastic equa- 
tions (Langevin or corresponding Fokker-Planck equa- 
tions) which may involve nonlinearity in the determin- 
istic as well as in the stochastic part. This approach 
has become particulary interesting, as it has been shown 
that it is possible to estimate the underlying stochastic 
equations directly by data analysis. 

The verification of the preconditions and the appli- 
cation of this approach to time dependent systems has 
been described in 0, H, 0, H, @| . It was successfuly ap- 
plied to the description of noisy electrical circuits [J, 
systems with feedback delay J3, traffic flow data [1] and 
physiological time series 0, [lOl, to mention just a few. 
Also the second class, the scale dependent complex sys- 
tems, which in general are not stationary in scale, can 
be analysed succesfuUy by this approach. In this class, 
stochastic processes evolving in scale are reconstructed. 
A complete statistical description, i.e. general n scale 
joint statistics, for certain classes of systems, such as the 
roughness of surfaces [ll|, , turbulence [ij, [3, and 
finance [l^ [13, can be obtained. Though in general 
a reconstruction of time series for the scale dependent 
complex systems is not possible in such a simple way as 
for the first class, certain promising attempts have been 
made 

The use of Langevin and Fokker-Planck equations is 
therefore a very promising method for time series anal- 



ysis. The critical part in this method is the correct es- 
timation of the coefficients of the Langevin or the cor- 
responding Fokker-Planck equation, which are the so 
called Kramers-Moyal coefficients. A correct estimation 
of these coefficients is crucial to a good description of the 
underlying processes. The estimation of the Kramers- 
Moyal coefficients is complicated by the fact, that the 
approach itself is based on the assumption of Markov 
properties. This assumption is valid for many systems 
for big and small but finite timesteps. The main diffi- 
culties arise from the fact that for the estimation of the 
underlying equations it is necessary to calculate the limit 
of infinitely small time steps, where the Markov proper- 
tics arc often no lon ger valid. For more details concerning 
this discussion see l[20l [2ll [2^ . Further concerns about 
systematic estimation problems were discussed in [23| . 
Due to these problems in the estimation process it was 
necessary till now to apply manual corrections ^vf\ to the 
determined Kramers-Moyal coefficients in some cases in 
order to get an optimal description. For time dependent 
systems these problems were adressed in by propos- 
ing an improved estimation method for the necessary pa- 
rameters. This improved estimation method utilises the 
comparison of the probability density functions (pdfs) 
generated by the Langevin equation and those computed 
directly from the empirical data. 

In this work we address the crucial problem of the cor- 
rect estimation of the coefficients for the Fokker-Planck 
equation for the second class of systems with scale de- 
pendent complexity. It should be noted, that there is no 
principle problem to transfer the results of this work to 
the class of time dependent complex systems. Further- 
more, the methods proposed in this work can be regarded 
as a systematic way to include the manual corrections 
described for example in [17]. In general for the class of 
scale dependent complex systems the processes are not 
stationary in scale and so it is often necessary to use 
numerical instead of analytic solutions. Therefore the 
knowledge of the solution to the Fokker-Planck equation 
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will not be global but point wise in the space spanned 
by the coefficients of the Fokker-Planck equation. In or- 
der to utilise the comparison of the pdfs, as it has been 
suggested for the first class of complex systems [24] , opti- 
mization routines are proposed to find the optimal set of 
parameters, which implies the best agreement between 
the numerical solutions of the Fokker-Planck equation 
and the pdfs computed directly from the data. 

In detail, in section [ll] the basic features for stochastic 
processes evolving in scale are discussed. A description 
of the optimisation routines is given in section IIIII First 
results for turbulence data are shown in section IIVI and 
new insights are pointed out for this type of data. Fur- 
ther applications of the discussed methods are shown in 
section |Vl Wc finish with some concluding remarks in 
section IVTl 



II. FOKKER-PLANCK-EQUATION 

We start with a situation, where for a complex system 
some amount of data x{t) is given. Here x denotes the 
describing quantity, such as heights for surfaces or veloc- 
ity for turbulent fields and t denotes a time or a space 
variable. For simplicity we assume that a; is a one dimen- 
sional quantity, noting that higher dimensional systems 
can be treated in a similar way ^25|. The scale depen- 
dent features are described by y{t,T), where r denotes 
the selected scale and y a quantity describing the disor- 
der (complexity) of x(t) in a r-neighbourhood. y may 
be a wavelet, a local roughness or any other local quan- 
tity (see for example [26j). Here we define y as a simple 
increment 



y{t,T) x{t + r) - x{t). 



(1) 



In order to obtain a statistically complete description of 
the system with respect to y, the joint probability density 
function p(yi,Ti, ...,y„,r„) of y^r) at different scales r^, 
has to be known. The joint pdf is constructed from the 
set of yi{Ti) obtained at the same t value. In the following 
the joint statistics of these increment processes are con- 
sidered. Because of the involved scales the dimension of 
the joint probability density can be very high. Therefore 
it is in general very difficult to compute this joint pdf 
from empirical time series. However the description and 
computation can be highly simplified if Markov proper- 
ties hold. This is the case if 

p{yt,n\yt+i,n+i,...,yn,Tn) = p{y^,n\y^+l,n+l) (2) 

is true for all i and n > i. Without loss of generality 
we take n < n+i. It should be noted that the Markov 
property can be tested for a given data set [1, [13, [13, [13, 
l28j . For valid Markov properties the joint probability 
density can be substantially simplified: 



Piyi,ri, 



,yn 



(3) 



Because the conditional pdfs of first order (the right side 
of Eq. ([2])) provide a complete description of a Markov 
process, they are the basic quantity to measure the cor- 
rectness of the description of a Markov process. This 
issue and the importance of using conditional pdfs and 
not unconditional pdfs for the verification of the esti 
mated process are discussed in f\A 
The idea proposed in [13, 14, 17 



16,1121,1291. 

30| is to model these 



conditional pdfs of ffist order with a Fokker-Planck equa- 
tion [31I, [13] evolving in scale. 



■■T^p{y,T\yo,To) 



-|i^«(y,r) + ^D(^)(,,.) 



(4) 



piy,T\yo,TQ). 



Note that, in contrast to the usual definition of the 
Fokker-Planck equation, here both sides are multiplied 
by r. This corresponds to a logarithmic length scale as 
used in (30j . This choice is convenient for analysing frac- 
tal scaling features, but does not imply any loss of gener- 
ality. L>(i)(a;,T) and D''^\x,t) are the drift or diffusion 
coefficients, respectively, and are defined as 



DW(x,r) 



(5) 
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{x' — xyp{x', T — At\x, T)dx' . 



It should be noted that the conditional pdf 
p{x' ,T — At\x,t) can be estimated directly from 
the data, and therefore the Kramers-Moyal coefficients 
can be determined by using Eq. ([5|). To see the validity 
of the Fokker-Planck ansatz the size of the 4th order 
Kramers-Moyal coefficient can be estimated, for further 
details see [13, [H, [Ml . 



III. OPTIMISATION 

For the optimised estimation of the Kramers-Moyal 
coefficients an implementation of the limited-memory 
Broyden-Fletcher-Goldfarb-Shanno algorithm for con- 
straint problems (L-BFGS-B algorithm) [33., [S [s^ in 
R [36] is used, which is described in the appendix in 
detail. The starting point is the approximation of the 
Kramers-Moyal coefficients determined by the evaluation 
of Eq. The coefficients are approximated by func- 



tions with free parameters q. 



U) 



D^'\y,T)^g{y,T,q^^\...,q^^^). 



(6) 
(7) 



p(2/i,n, \y2,T2) 



■piVn 



-iWr, 



i) -piVn 



Solving Eq. (g]) as proposed in [T3] by using D^^'^{y,T) 
and D^'^'>{y,T) as the drift and diffusion coefficient re- 
spectively, leads to a conditional pdf of first order 
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Pnum(j/j-i,T,_i|?/i,ri,g^^\...,g£\g^^\...,g^^^). In order 
to maximise the agreement between these numerical so- 
lutions Pnum of Eq. (dl and the pdfs from the empiri- 
cal data, a measure is needed. Here, the weighted mean 
square error in logarithmic space is used, which is defined 
as 

/ dr {pn + pre/)(lnp„ - Inpref)'^ 

dM{pn,Pref) ■= y— 7 — ^ — ^ (8) 

J dr [Pn -I- Pre/) (In Pn + In Pre/) 
R 

Here R denotes the subspace, where an estimate of pref 
from empirical data is possible and pref > 0. p„ and Pref 
are joint probabilities of second order Tj„i; y^, t^) 

which are obtained from the conditional probabilities by 
the multiplication with the empirical probability density 
p{yi,Ti). The L-BFGS-B algorithm minimizes the non- 
linear function dM{Pn,Pref, q) under constraints for each 
component of q, which may be denoted as L < q <U . q is 
the Nq ~ fh + h dimensional vector of all the parameters 

qQ^\ . . . , ql^\ qj^\ . . . , q'^'' that are necessary to determine 
the functional form of the first two Kramers-Moyal coef- 
ficients for a given scale r. L and U represent the lower 
and upper bound on q, respectively. 

IV. RESULTS FOR TURBULENCE 

The procedure described above, is now applied to ex- 
perimental data. The data considered were obtained 
from a cryogenic axisymmetric helium gas jet at a 
Reynolds number of 7.6T0^. The data set contain 1.6T0^ 
measurements of the velocity in the center of a free jet, 
where the distance between the anemometer and the noz- 
zle was 40D and the diameter D of the nozzle was 2 mm. 
For further details we refer to [13] ■ This high Reynolds 
number data set have the benefit, that the region between 
the Markov-Einstein coherence length and the inte- 
gral length spans a large interval of scales. This is impor- 
tant because below the Markov-Einstein coherence length 
Eq. ([4]) cannot be applied due to the missing Markov 
properties. Above the integral length the properties of 
the turbulent cascade are not present any more. 

A first approximation of the Kramers-Moyal coeffi- 
cients is determined by means of their definition in 
Eq. (O. Then these first estimates are used to recon- 
struct the conditional probability density. In order to 
assess the quality of the solution the distance dM be- 
tween the probability density of the data on a certain 
scale T and the reconstructed probability density by the 
estimated process equation is calculated. Thereby , the 
distance dM{Pn,Pref) given in Eq. ^ is used. Now the 
iterative algorithm, described in the appendix is used to 
minimize dMiPmPref) with respect to q. 

The optimisation, for the turbulence data used here, 
is performed for each value of the scale individually. In 
order to do this, the range of scales is divided into small 
half-open intervals [t^s, Tie[. Thus we apply here a piece- 



wise constant approximation to the scale dependent pro- 
cess. The optimisation is performed independently for 
each of these intervals, with a parametrisation of the 
Kramers-Moyal coefficients that is no longer dependent 
on the scale, as was found in our previous work [14, i29j. 
The first estimate of the Kramers-Moyal coefficients can 
be fitted with a polynomial of first order for the drift coef- 
ficient and a polynomial of second order for the diffusion 
coefficient. Therefore the Kramers-Moyal coefficients are 
parameterised as 

^W(y,r) = qi'^+q['^y (9) 

D^'Hy,r) = g^^'+gj^^y + gf^y^ (10) 

For many applications such a parametrisation of the 
Kramers-Moyal coefficient with polynomials seems to be 
a good choice. 

(i) 

Though working with constant coefficients qf consti- 
tutes an approximation it has two advantages. The first 
is that the smoothness of the resulting functions qf with 
respect to r provides a first assessment of the robustness 
of the optimisation process, if we assume the true coeffi- 
cients to be smooth functions with respect to the scale. 
The second and more important advantage is, that the 
number of variables Nq, or in other words, the dimen- 
sion of the space where the optimisation takes place, is 
smaller. This is important, because the optimisation in 
a lower dimensional space can be much faster than one 
in a high dimensional space. In addition the number of 
local minima may increase rapidly with the addition of 
more variables and therefore the localization of the global 
minimum becomes more difficult. 

The optimisation is performed for scales ranging from 
TMarkov to 80750 Sample steps, where TMarkov denotes 
the Markov-Einstein coherence length, which is in this 
case 8 sample steps. The integral length for this data set 
is 715 sample steps. The scale intervals are chosen here 
in such a way, that 

Tie = max ^^-jT^s + TMarkov^ , ( 1 1 ) 

where r^s denotes the left and Tie the right border of 
the scale interval. The initial estimate of the Kramers- 
Moyal coefficients by means of their definition in Eq. ([5|) 
can only be performed for a scale larger than t^, where 
Tc > TMarkov In our casc Tc = 5 ■ TMarkov This is due 
to the procedure to perform the limit in Eq. ([5]) numer- 
ically, for more details see e.g. [l3|. Therefore as initial 
estimates the values of the Kramers-Moyal coefficients 
at the scale Tc are used for scales t < Tc- The limit in 
Eq. ([5]) is calculated without the use of possible refine- 
ments in order to test the robustness of the optimisation 
routine. The boundaries L and U are set in a simple way, 
to prevent D^^^ (y, r) from becoming negative. The aver- 
age number of iterations before the optimisation stopped 
was around 25. 

The values of the distance measure between the pdfs 
of the original data and the reconstructed ones using the 
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t in sample steps 

FIG. 1: Distance measure (Im between the pdfs of the original 
data and the reconstructed ones using the numerical solutions 
of the Fokker-Planck equation determined by the initial esti- 
mate of the Kramers-Moyal coefficients (open symbols) and 
optimised estimate of the Kramers-Moyal coefficients (black 
dots). The dotted line provides the expected distance, if both 
distributions have been produced by the same process. 
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FIG. 2: Conditional probability density p(j/(r — 1817)|j/(r = 
2272)) of given data (unbroken lines) and reconstructed by 
the numerical solution of the Fokker-Planck equation (dot- 
ted lines) using the initial estimates of the Kramers-Moyal 
coefficients. 



initial estimates of the Kramers-Moyal coefficients are 
displayed in Fig. [T] as open symbols. Three ranges can 
be identified. The first range spans from TMarkov to Tc- 
Here the limit could not be calculated in a proper way 
and constant initial estimates of the Kramers-Moyal co- 
efficients have been used, resulting in a nearly constant 
distance measure in this range. The second range spans 
from Tc to Top, where Top is around 300 sample steps. In 
this range the distance measure decreases monotonically 
with increasing scale r. This may be due to a better de- 
scription of the data with increasing r, or due to a better 
performance of the initial estimate of the Kramers-Moyal 
coefficients, or due to both. In the third range the dis- 
tance measure increases after the minimum at Top, which 
marks the border between the second and the third range. 

Performing the optimisation routine described above, 
the distance measure between the pdfs of the original 
data and the reconstructed ones using the optimisation 
routine is obtained. The distance measure for the op- 
timised pdfs is displayed in Fig. [1] as black dots. For 
very small scales t < Tc the distance measure remains 
constant or increases slightly. For a very broad range of 
scales the distance measure then declines monotonically, 
until it saturates for very large scales. Interestingly a 
scale r ~ Top exists, where the distance function has ap- 
proximately the same value for the initially estimated 
coefficients and the optimised one. This indicates that 
for Top the initial estimate of the Kramers-Moyal coeffi- 
cients is already optimal. We obtained similar results for 
other data sets. 

In order to assess the significance of the results the in- 
trinsic error is estimated. The data set is divided in sub 
sets and the distance between the distributions belonging 
to the corresponding sub sets is calculated. This is done 
for different sizes of sub sets and then extrapolated to 
obtain the intrinsic error for the whole data set. As seen 
in Fig. [1] the intrinsic error is still smaller for scales up to 
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FIG. 3: Conditional probability density p(j/(r — 1817)|j/(r = 
2272)) of given data (unbroken lines) and reconstructed by 
the numerical solution of the Fokker-Planck equation (dotted 
lines) using the optimised estimates of the Kramers-Moyal 
coefficients. 



lO'* than the distance measure for the optimised coeffi- 
cients, nevertheless to our interpretation the magnitude 
of the distance measure is with lO"'^ sufficiently small, 
see Fig. [2] and Fig. [3] for an example. 

The graphs for the optimised coefficients q[^\ q^^ and 
q^^ are shown in Fig. 2]- [51 For q^^^ and g^-* the initial 
estimates as well as the optimised values are essentially 
equal to zero. This result has an interesting physical 
context. It has been shown that for a higher dimensional 

(2) 

analysis a corresponding non- vanishing q\ term violates 
the second von Karman equation (see Eqs. (31) and (32) 
in 25]). In Fig. [5] two ranges can be identified. For 
smaller scales r < r^, , where Tg is around 2000 sample 

(2) 

steps, takes non-vanishing positive values, while it 
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FIG. 4: The parameter The initial estimate is denoted 

with white circles while the optimised one is denoted with 
black circles. The grey line shows the results for centered 
increments using optimised coefBcents. 
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(2) 

FIG. 5: The parameter . The initial estimate is denoted 
with white circles while the optimised one is denoted with 
black circles. The grey line shows the results for centered 
increments using optimised coeflicents. 

becomes zero for scales r > t^. The other non- vanishing 

(2) 

term of the second Kramers- Moyal coefficient, , in 
Fig. [51 exhibits in the same region a power-law behaviour 

which saturates for larger scales. The same is true for (/J^' 
in Fig. 21 but with a much higher accuracy. The exponent 
for gj^-* in this region is 0.069. 
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FIG. 6: The parameter • The initial estimate is denoted 
with white circles while the optimised one is denoted with 
black circles. The grey line shows the results for centered 
increments using optimised coeflicents. 



FIG. 7: The second and fourth moment of the data set. 

For the interpretation of we note, that §2 repre- 
sents the multiplicative noise in the system. By investi- 
gating the moments of the system, which are also called 
structure functions, in Fig. [71 it can be noted that the 
moments start to saturate at a scale, which is compara- 
ble to Tg. This indicates that Tg is related to the integral 

(2) 

length of the system. Further ^2 — for t > Tg is in 
agreement with a Gaussian shape of the pdf of velocity 
increments for large scales. 

V. APPLICATIONS 

The method above provides a much better answer to 
the central question of determining the correct Kramers- 
Moyal coefficients. But besides this it enables us to dis- 
cuss further important questions arising from the descrip- 
tion of scale dependent systems with a Fokker-Planck 
equation. The first of these questions is the optimal in- 
crement definition for the stochastic process as given by 
Eq. We started our analysis using the left -justified 
increments, which are more common in the literature. 
Using left-justified increments means, that the smaller 
increment is nested inside the larger increment and that 
both increments have the left endpoint in common. For 
certain classes of systems this may introduce additional 
correlations between the increments that are not desired 
. Thus it has been proposed to use centered increments 
instead of left-justified ones 

y{t,T):^x{t+^^ -x{t-^y (12) 

It is now possible to investigate the improvement of the 
description of the system by using centered increments. 
As a criterion we use the distance between the numer- 
ical solution of the Fokker-Planck equation, where the 
coefficients have been optimised, and the empirical pdf. 
As can be seen in Fig. [51 the distance measure exhibts 
smaller values for centered increments with the excep- 
tions of very small and very large scales. This indicates 
that the description of this special system indeed can be 
improved using centered increments. Reanalysing the co- 
efficients that are shown in Fig.[31-[51no principal changes 
are found for the use of centered increments. In Fig.[31-[ni 
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Therefore it may be inferred that a further improve- 
ment of the description of this system may be provided 
by using an appropriate increment definition rather than 
adopting the assumption of asymmetric Kramers-Moyal 
coefficients. The first provides a better description of 
the system by using fewer free variables compared to the 
second. Another questions that may be answered in such 
a fashion is the use of higher order Kramers-Moyal co- 
efficients, especially the fourth order coefficient because 
of its importance for the application of the theorem of 
Pawula I3ll. 



VI. CONCLUSIONS 



FIG. 8: Comparison of the distance function iIm for different 
settings. The dottetd line provides the expected distance, if 
both distributions have been produced by the same process. 



additionally the results for the centered increment anal- 
ysis are shown. The biggest change is found for q[^^ and 

Qq and for scales larger than the integral scale, which 
are less important. 

The second important question concerns the 
parametrisation of the Kramers-Moyal coefficients. 
It is now possible to determine, whether a more complex 
parametrisation of the Kramers-Moyal coefficients, for 
example by using higher order polynomials, yields a 
better description of the system. As a simple example 
to illustrate this, the question of asymmetric Kramers- 
Moyal coefficients is considered. If the initial estimates 
of the Kramers-Moyal coefficients are examined, the 
functional form appears to be asymmetric in some 
cases. This seems to be especially true for very large 
scales, where the number of independent events becomes 
smaller. In order to verify if the underlying stochastic 
process can be better described by a separate parametri- 
sation for negative and positive increments, the following 
parametrisation is chosen for the optimisation: 




q^'^+q['^y if y <0 



(2) , (2) 

% +ii y 

(2) , (2) , U) 2 



(2)„2 ify^O 



92 y 

,(2 



if 2/ > 



(13) 



(14) 



As depicted in Fig. [8] the distance function takes smaller 
values than for the original optimisation, although the 
improvement is not as large as when using centered in- 
crements. It should further be noted, that an improve- 
ment is in this case not surprising since the optimisation 
is now performed in a higher dimensional space and the 
space used for the original optimisation is a subspace of 
this second optimisation. Nevertheless this finding is in 
accordance with the proposed importance of higher odd 
order terms in the diffusion coefficient 12811. 



In this work we have shown a practical way to im- 
plement an optimisation routine to improve the de- 
scription of hierarchical systems by means of a Fokker- 
Planck equation. In order to do so, first an estimate 
of the Kramers-Moyal coefficients using their definition 
in Eq. ([5]) is calculated. This initial estimate is then 
used to solve the Fokker-Planck equation numerically and 
to obtain as a solution the conditional probability den- 
sity functions (pdfs) of first order. As a next step the 
distance between this reconstructed conditional proba- 
bility and the one obtained directly from the time se- 
ries is determined using Eq. ([5]). This procedure forms 
the basis of our optimisation routine. A parametrisa- 
tion of the initial estimate of the Kramers-Moyal coeffi- 
cients is chosen, with a specified number of variables Nq. 
The L-BFGS-B algorithm is employed to minimize the 
distance between the numerical solutions of the Fokker- 
Planck equation and the empirical pdfs by adjusting the 
free variables. The L-BFGS-B algorithm is an algorithm 
which is very effective in the case of an optimisation of 
many variables which may be constrained. Therefore the 
method proposed here will also be effective for very com- 
plex parametrisation, as long as these parametrisations 
are not misspecified. 

We applied the optimisation routine to a time series 
of velocity measurements obtained from a cryogenic ax- 
isymmetric helium gas jet. We demonstrated the benefits 
of this optimisation routine. At first it is possible to ob- 
tain values of the Kramers-Moyal coefficients for much 
smaller scales, due to the fact that it is no longer neces- 
sary to calculate a limit in scale which is the bottle-neck 
of the original Kramers-Moyal method. At second the 
optimised coefficients produce numerical solutions of the 
Fokker-Planck equation that are much closer to the em- 
pirical pdfs than those produced by the initial estimates. 
At third possible systematic errors in the classical es- 
timation routine of the Kramers-Moyal coefficients that 
have been pointed out in the literature can be avoided 
using this optimisation routine. At fourth the optimised 
coefficients show remarkable simple functional forms in 
a large scaling region, while the behaviour of the initial 
estimates is much more ambiguous. At last the results 
produced by this optimisation routine are remarkable sta- 
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ble. Independent optimisations have been performed for 
small intervals in scale bordering on each other, produc- 
ing estimates which are very smooth with respect to the 
scale. Therefore this method provides the means to de- 
termine the Kramers-Moyal coefficients with much more 
accuracy or to determine correct Kramers-Moyal coeffi- 
cients for small data sets. 

Possible applications for this refined approach have 
been shown. First the question of the appropriate in- 
crement definition has been considered. It has been 
shown that by using centered increments instead of left- 
justified ones, the description of the underlying stochas- 
tic process for our example system can be improved. 
Second the question of the optimal parametrisation of 
Kramers-Moyal coefficients in the Fokker-Planck equa- 
tion has been considered. It was shown that in our case 
an asymmetric parametrisation provides only a slight 
improvement. This aspect interests because it is di- 
rectly related to closure of the higher order moments, 
see Eq. (4.13) in [l3|. With our findings here we see that 
a perturbing linear term for the diffusion coefficient may 
have no significance; thus the reported contradiction of 
the reconstructed Fokker-Planck equation with the sec- 
ond Karman equation seems to have no significance, or 
saying it in other words, this discrepancy is just a result 
of an inaccurate estimation. Further applications may 
include the analysis of more complex parametrisations 
of the Kramers-Moyal coefficients and the influence of 
higher order Kramers-Moyal coefficients, thereby offer- 
ing new insights in the complexity of turbulence. 
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is computed, 

m(k){q) ■= (A.l) 

d{q{k)) + gfk)iQ - 9(fc)) + - <l(k)fB^k){<l - q{k)), 

where (7(i.) denotes the gradient of d{q) with respect to 
q and -B(fe) is a limited-memory BFGS approximation to 
the Hessian. 

Because the following steps have to be repeated for 
each (fc), the index (fc) is omitted for these steps if it does 
not change the meaning. As a second step a set of active 
bounds has to be found using the gradient projection 
method. The projection of an arbitrary point q onto the 
feasible region is defined by 

{Li qi < Li, 

q, if q^e[L„U,], (A.2) 
U, qi > Ui. 

Therefore a piecewise linear path q{s), which is the pro- 
jection of the steepest descent direction at the starting 
point q^ onto the feasible region, defined by Eq. (|A.2[) . is 
denoted by 

q{s)^P{q^ -sg,L,U). (A.3) 

As a third step the generalized Cauchy point q'^, which 
is defined as the first local minimizer of the function 
m(^kj{q{s)) on the piecewise linear path q{s), is computed. 
The components of which are at their upper or lower 
bound, U or L, comprise the active set A{q'^) of variables. 

As a fourth step the following quadratic problem over 
the subspace of free variables is considered. 

min{m(fe)(g)|(7i g," V i G Aiq")}, (A.4) 



subject to L < q <U M i ^ Aiq" 



(A.5) 



APPENDIX: L-BFGS-B ALGORITHM 

For the optimised estimation of the Kramers-Moyal 
coefficients we apply an iterative procedure, called L- 
BFGS-B algorithm |l[3i,[3|. Solving Eq. g]) as pro- 
posed in leads to a conditional pdf of first order 
Pnum{yi-i,n_i\yi,Ti, q) . In order to maximise the agree- 
ment between the numerical solutions p„ and the pdfs 
from the empirical data Prefi a measure is needed. Here 
dM{pn,Pref,q) IS uscd, which is defined in Eq. ([5]). The 
L-BFGS-B algorithm minimizes the non-linear function 
dMiPn,Pref,q), here d{q) is used as a short hand nota- 
tion, under the constraint L < q < U . L and U represent 
the lower and upper bound on q, respectively. 

Using the details provided above, an iterative proce- 
dure is started to find the vector q which minimizes the 
distance function d{q). As a first step of each iteration 
(k) a quadratic model m(^i^^{q) of d{q) at the iterate q(^k) 



Eq. (jA.4[) is solved approximately without the condition 
of Eq. (jA.sp using a direct primal method [3j|. The so- 
lution of this unconstrained problem is denoted by 5". 
Therefore the solution of the constrained problem can be 
written as 



9(fc+i) 



(fe) 



if i ^ J^, 



where 



and 



6* = a*S'' 



(A.6) 



(A.7) 



(A., 



max{a|a < 1,L, - q'^ < a6^ <U^- q^, i £ T}. 
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Z denotes the Nq x iV^ matrix of unit vectors (i.e. each 
column is a column of the identity matrix), that span 
the subspace of free variables at q^, where iV^ denotes 
the number of free variables at g^. T denotes the set of 
indices korresponding to the set of free variables. 

As a last step a line search between the current and 
the approximate minimizer q^i^^i^ is performed, which 
satisfies the strong Wolfe conditions 

d{qk+i) < d{qk) + CiXkgl{ql+i - Qk) (A.9) 



gl+iill+i -qk)\< C2\gl{qi+x - qk)\ (A.IO) 
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